Multiple Clustering Views from Multiple Uncertain Experts
Authors
Abstract
Expert input can improve clustering performance. In today’s collaborative environment, crowdsourced input from multiple experts is becoming common. Given multiple experts’ inputs, most existing approaches can only discover one clustering structure. However, data is multi-faceted by nature and can be clustered in different ways (also known as views). In an exploratory analysis problem where ground truth is not known, different experts may have diverse views on how to cluster data. In this paper, we address the problem of how to automatically discover multiple ways to cluster data given potentially diverse inputs from multiple uncertain experts. We propose a novel Bayesian probabilistic model that automatically learns the multiple expert views and the clustering structure associated with each view. The benefits of learning the experts’ views include 1) enabling the discovery of multiple diverse clustering structures, and 2) improving the quality of the clustering solution in each view by assigning higher weights to experts with higher confidence. In our approach, the expert views, multiple clustering structures and expert confidences are jointly learned via variational inference. Experimental results on synthetic datasets, benchmark datasets and a real-world disease subtyping problem show that our proposed approach outperforms competing baselines, including meta clustering, semi-supervised clustering, semi-crowdsourced clustering and consensus clustering.
Related resources
Supplementary Material: Multiple Clustering Views from Multiple Uncertain Experts
For variational inference in our approach, we use the following parameter settings: 1. G, the number of components in the truncated Dirichlet Process, is set to M/2, where M is the total number of experts. In this way, we try to enforce the constraint that, on average, there should be at least two experts in each view. In all experiments, the number of expert views recovered by our approach is sma...
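As a rough illustration of the truncation setting described above, the following sketch draws mixing weights over expert views from a truncated stick-breaking construction with G = M/2 components. It is not the paper's inference code: the function name `truncated_stick_breaking`, the concentration parameter `alpha`, the seed, and the rounding of M/2 for odd M are all assumptions made for this example.

```python
import numpy as np

def truncated_stick_breaking(num_experts, alpha=1.0, seed=0):
    """Sample weights over expert views from a truncated stick-breaking prior."""
    # Truncation level G = M/2 as stated in the text. Rounding down and
    # keeping at least one component is an assumption for odd or small M.
    G = max(1, num_experts // 2)
    rng = np.random.default_rng(seed)
    # Stick proportions v_g ~ Beta(1, alpha). The last proportion is fixed
    # to 1 so that the G weights sum to one under truncation.
    v = rng.beta(1.0, alpha, size=G)
    v[-1] = 1.0
    # Length of stick remaining before each break: prod_{h<g} (1 - v_h).
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

weights = truncated_stick_breaking(num_experts=10)
print(len(weights), weights.sum())
```

With M = 10 experts this yields five candidate views. In a variational treatment, views whose weights shrink toward zero are effectively unused, which is one way the number of recovered views can end up below G.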
Clustering from Multiple Uncertain Experts
Utilizing expert input often improves clustering performance. However, in a knowledge discovery problem, ground truth is unknown even to an expert. Thus, instead of one expert, we solicit opinions from multiple experts. The key question motivating this work is: which experts should be assigned higher weights when there is disagreement on whether to put a pair of samples in the same group? To ...
Integration of Single-view Graphs with Diffusion of Tensor Product Graphs for Multi-view Spectral Clustering
Multi-view clustering takes the diversity of multiple views (representations) into consideration. Multiple views may be obtained from various sources or different feature subsets and often provide complementary information to each other. In this paper, we propose a novel graph-based approach to integrate multiple representations to improve clustering performance. While original graphs have been wid...
Hybrid Hierarchical Clustering: Forming a Tree From Multiple Views
We propose an algorithm for forming a hierarchical clustering when multiple views of the data are available. Different views of the data may have different underlying distance measures which suggest different clusterings. In such cases, combining the views to get a good clustering of the data becomes a challenging task. We allow these different underlying distance measures to be arbitrary Bregm...
Co-regularized Multi-view Spectral Clustering
In many clustering problems, we have access to multiple views of the data, each of which could be individually used for clustering. Exploiting information from multiple views, one can hope to find a clustering that is more accurate than the ones obtained using the individual views. Often these different views admit the same underlying clustering of the data, so we can approach this problem by lookin...